A Hybrid Approach for Mining Maximal Hyperclique Patterns

Authors

  • Yaochun Huang
  • Hui Xiong
  • Weili Wu
  • Zhongnan Zhang
Abstract

A hyperclique pattern [12] is a new type of association pattern whose items are highly affiliated with each other: the presence of any one item in a transaction strongly implies the presence of every other item in the same hyperclique pattern. In this paper, we present a new algorithm for mining maximal hyperclique patterns, which are desirable for pattern-based clustering methods [11]. The algorithm combines key advantages of the Depth First Search (DFS) and Breadth First Search (BFS) strategies. Specifically, we adapt equivalence pruning, one of the most efficient pruning methods of the DFS strategy, to the BFS process. Our experimental results show that the algorithm can be orders of magnitude faster than standard maximal frequent pattern mining algorithms, particularly at low levels of support.
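To make the notion of a maximal hyperclique pattern concrete, the following minimal sketch enumerates such patterns by brute force. It is not the hybrid DFS/BFS algorithm of the paper and uses no equivalence pruning. It assumes the h-confidence measure from the hyperclique literature, which this abstract does not spell out: the support of the pattern divided by the largest support of any single item in it. The function names and the toy transactions are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def h_confidence(itemset, transactions):
    """Assumed definition: supp(P) / max over items i in P of supp({i})."""
    max_item_supp = max(support([i], transactions) for i in itemset)
    if max_item_supp == 0:
        return 0.0
    return support(itemset, transactions) / max_item_supp


def maximal_hypercliques(transactions, min_hconf, min_supp=0.0):
    """Naive enumeration, largest candidates first (illustration only).

    A pattern is reported if it meets both thresholds and is not a
    proper subset of a pattern that was already reported. Single items
    are not considered.
    """
    items = sorted(set().union(*transactions))
    found = []
    for size in range(len(items), 1, -1):
        for cand in combinations(items, size):
            c = frozenset(cand)
            if any(c < m for m in found):
                continue  # covered by a larger hyperclique, so not maximal
            if (support(c, transactions) >= min_supp
                    and h_confidence(c, transactions) >= min_hconf):
                found.append(c)
    return found


# Hypothetical toy data: five transactions over items a, b, c, d.
transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("cd"),
    frozenset("d"),
]
loose = maximal_hypercliques(transactions, min_hconf=0.6)
strict = maximal_hypercliques(transactions, min_hconf=0.9)
```

With the looser threshold the three items a, b, c form a single maximal pattern; with the stricter one only the pair a, b survives, because c also occurs without a and b. The enumeration is exponential in the number of items, which is the cost that pruning strategies such as the one described in the abstract aim to avoid.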


Similar resources

Mining Quantitative Maximal Hyperclique Patterns: A Summary of Results

Hyperclique patterns are groups of objects which are strongly related to each other. Indeed, the objects in a hyperclique pattern have a guaranteed level of global pairwise similarity to one another, as measured by the uncentered Pearson's correlation coefficient. Recent literature has provided an approach to discovering hyperclique patterns over data sets with binary attributes. In this paper, we ...


Mining Strong Affinity Association Patterns in Data Sets with Skewed Support Distribution

Existing association-rule mining algorithms often rely on a support-based pruning strategy to prune their combinatorial search space. This strategy is not very effective for data sets with skewed support distributions, because such algorithms tend to generate many spurious patterns involving items from different support levels or to miss potentially interesting low-support patterns. To overcome these problem...


Identification of Functional Modules in Protein Complexes via Hyperclique Pattern Discovery

Proteins usually do not act in isolation in a cell but function within complicated cellular pathways, interacting with other proteins either in pairs or as components of larger complexes. While many protein complexes have been identified by large-scale experimental studies, due to a large number of false-positive interactions existing in current protein complexes [10], it is still difficult to obtain...


WIP: mining Weighted Interesting Patterns with a strong weight and/or support affinity

In this paper, we present a new algorithm, Weighted Interesting Pattern mining (WIP) in which a new measure, weight-confidence, is developed to generate weighted hyperclique patterns with similar levels of weights. A weight range is used to decide weight boundaries and an h-confidence serves to identify strong support affinity patterns. WIP not only gives a balance between the two measures of w...


Hybrid ASP-Based Approach to Pattern Mining

Detecting small sets of relevant patterns from a given dataset is a central challenge in data mining. The relevance of a pattern is based on user-provided criteria; typically, all patterns that satisfy certain criteria are considered relevant. Rule-based languages like Answer Set Programming (ASP) seem well-suited for specifying such criteria in the form of constraints. Although progress has been m...




Publication date: 2006